Text Localization, Enhancement and Binarization in Multimedia Documents

نویسندگان

  • Christian Wolf
  • Jean-Michel Jolion
  • Françoise Chassaing
چکیده

The systems currently available for content based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge into the indexing process is to use the text included in the images and video sequences. It is rich in information but easy to use, e.g. by key word based queries. In this paper we present an algorithm to localize artificial text in images and videos using a measure of accumulated gradients and morphological post processing to detect the text. The quality of the localized text is improved by robust multiple frame integration. A new technique for the binarization of the text boxes is proposed. Finally, detection and OCR results for a commercial OCR are presented.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Survey on Various Approaches of Text Extraction in Images

Text Extraction plays a major role in finding vital and valuable information. Text extraction involves detection, localization, tracking, binarization, extraction, enhancement and recognition of the text from the given image. These text characters are difficult to be detected and recognized due to their deviation of size, font, style, orientation, alignment, contrast, complex colored, textured ...

متن کامل

An Enhancement of Images Using Recursive Adaptive Gamma Correction

The “Adaptive Approach for Historical or Degraded Document Binarization” is that in which Libraries and Museums obtain in large gathering of ancient historical documents printed or handwritten in native languages. Typically, only a small group of people are allowed access to such collection, as the preservation of the material is of great concern. In recent years, libraries have begun to digiti...

متن کامل

Robust binarization of degraded document images using heuristics

Historically significant documents are often discovered with defects that make them difficult to read and analyze. This fact is particularly troublesome if the defects prevent software from performing an automated analysis. Image enhancement methods are used to remove or minimize document defects, improve software performance, and generally make images more legible. We describe an automated, im...

متن کامل

Binarization of Low Quality Text Using a Markov Random Field Model

Binarization techniques have been developed in the document analysis community for over 30 years and many algorithms have been used successfully. On the other hand, document analysis tasks are more and more frequently being applied to multimedia documents such as video sequences. Due to low resolution and lossy compression, the binarization of text included in the frames is a non trivial task. ...

متن کامل

Phase-Based Binarization of Ancient Document Images

The main defects present in historical documents are darkness, non-uniform clarification, bleed-through and faded characters. To remove these defects binarization method is used. In this paper a phase based binarization method is studied in which phase of ancient document images is preserved. This method is derived in to three steps: preprocessing, main binarization and post processing. In prep...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002